Word Length and Word Frequency

Authors

  • Udo Strauss
  • Peter Grzybek
  • Gabriel Altmann
Abstract

Since the appearance of Zipf’s works (esp. Zipf 1932, 1935), his hypothesis “that the magnitude of words tends, on the whole, to stand in an inverse (not necessarily proportionate) relationship to the number of occurrences” (1935: 25) has been generally accepted. Zipf illustrated the relation between word length and frequency of word occurrence using German data, namely the frequency dictionary of Kaeding (1897–98). Over the past century, Zipf’s idea has been repeatedly taken up and examined with regard to specific problems. Surveying the pertinent work associated with this hypothesis, one cannot avoid the impression that quite a number of problems remain unsolved to date. This seems to be due mainly to the fact that the fundamentals of the different approaches involved have not been systematically scrutinized. Some of these unsolved problems can be captured in the following points:

Similar resources

How Different Are Language Models and Word Clouds?

Word clouds are a summarised representation of a document’s text, similar to tag clouds which summarise the tags assigned to documents. Word clouds are similar to language models in the sense that they represent a document by its word distribution. In this paper we investigate the differences between word cloud and language modelling approaches, and specifically whether effective language model...

Statistical word sense aware topic models

LDA has proved effective in modeling the semantic relation between surface words. This semantic information in the document collection is useful for measuring the topic distribution of a document. In general, a surface word may contribute significantly to several topics in a document collection. LDA measures the contribution of a surface word to each topic and considers a surface word to be ...

Some macro quantitative features of low-frequency word classes

This contribution examines the macro quantitative features of 15 low-frequency word classes. The relationship between word frequency classes and the sizes of the frequency classes obeys Altmann’s power law, and the sizes of low-frequency word classes increase as text length increases. The relationship between text length and the sizes of low-frequency word classes also obeys Altmann...

Word length and frequency effects on children’s eye movements during silent reading

In the present study we measured the eye movements of a large sample of 2nd-grade German-speaking children and a control group of adults during a silent reading task. To be able to directly investigate the interaction of word length and frequency effects, we employed controlled sentence frames with embedded target words in an experimental design in which length and frequency were manipulated ind...

Lexical and sublexical components of age-related changes in neural activation during visual word identification.

Positron emission tomography data (Madden, Langley, et al., 2002) were analyzed to investigate adult age differences in the relation between neural activation and the lexical (word frequency) and sublexical (word length) components of visual word identification. The differential influence of these components on reaction time (RT) for word/nonword discrimination (lexical decision) was generally ...

Publication date: 2010